11 research outputs found
Multidialectal acoustic modeling: a comparative study
In this paper, multidialectal acoustic modeling based on sharing data
across dialects is addressed. A comparative study of different methods of
combining data based on decision tree clustering algorithms is presented.
The approaches developed differ in the way the similarity of sounds between
dialects is evaluated and in the decision tree structure applied. The proposed
systems are tested with Spanish dialects across Spain and Latin America. All
the proposed multidialectal systems improve on monodialectal performance by
using data from another dialect, but it is shown that the way data is shared
is critical. The best combination of similarity measure and tree structure
achieves an improvement of 7% over the results obtained with monodialectal systems.
Peer Reviewed. Postprint (published version)
First experiments on an HMM based double layer framework for automatic continuous speech recognition
The usual approach to automatic continuous speech recognition is what can be called the acoustic-phonetic modelling approach. In this approach, voice is considered to hold two different kinds of information: acoustic and phonetic. Acoustic information is represented by some kind of feature extraction from the voice signal, and phonetic information is extracted from the vocabulary of the task by means of a lexicon or some other procedure. The main assumption in this approach is that models can be constructed that capture the correlation existing between both kinds of information.
The main limitation of acoustic-phonetic modelling in speech recognition is its poor treatment of the variability present at both the phonetic and the acoustic level. In this paper, we propose the use of a slightly modified framework in which the usual acoustic-phonetic modelling is divided into two different layers: one closer to the voice signal, and the other closer to the phonetics of the sentence. By doing so, we expect an improvement in modelling accuracy, as well as better management of acoustic and phonetic variability. Experiments carried out so far, using a very simplified version of the proposed framework, show a significant improvement in the recognition of a large-vocabulary continuous speech task, and represent a promising starting point for future research.
Peer Reviewed. Postprint (published version)
Joint training of codebooks and acoustic models in automatic speech recognition using semi-continuous HMMs
In this paper, three different techniques for building semi-continuous HMM-based speech recognisers are compared: the classical one, using Euclidean-generated codebooks and independently trained acoustic models; jointly re-estimating the codebooks and models obtained with the classical method; and jointly creating codebooks and models, growing their size from one centroid to the desired number of them. The way this growth may be done is carefully addressed, focusing on the selection of the splitting direction and the way splitting is implemented. Results in a large-vocabulary task show the efficiency of the approach, with noticeable improvements both in accuracy and CPU consumption. Moreover, this scheme enables the use of the concatenation of features, avoiding the independence assumption usually needed in semi-continuous HMM modelling, and leading to further improvements in accuracy and CPU consumption.
Peer Reviewed. Postprint (published version)
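The idea of growing a codebook from one centroid to a desired size by splitting can be illustrated with a minimal sketch. It assumes a plain LBG-style procedure with Euclidean distance and a fixed additive perturbation as the splitting direction; it does not reproduce the paper's joint training of codebooks and acoustic models, and the names `grow_codebook`, `refine`, and `epsilon` are hypothetical.

```python
def nearest(point, centroids):
    """Index of the centroid closest to point (squared Euclidean distance)."""
    return min(
        range(len(centroids)),
        key=lambda k: sum((p - c) ** 2 for p, c in zip(point, centroids[k])),
    )


def mean(points):
    """Component-wise mean of a non-empty list of vectors."""
    n = len(points)
    return [sum(column) / n for column in zip(*points)]


def refine(data, centroids, iterations=10):
    """Run a few k-means passes; an empty cluster keeps its old centroid."""
    for _ in range(iterations):
        clusters = [[] for _ in centroids]
        for x in data:
            clusters[nearest(x, centroids)].append(x)
        centroids = [
            mean(cluster) if cluster else old
            for cluster, old in zip(clusters, centroids)
        ]
    return centroids


def grow_codebook(data, target_size, epsilon=0.01):
    """Grow a codebook from one centroid up to target_size by splitting.

    Assumption: each split perturbs a centroid by +/- epsilon on every axis.
    The paper studies the choice of splitting direction; this sketch does not.
    """
    centroids = [mean(data)]
    while len(centroids) < target_size:
        # Split only as many centroids as are needed to reach target_size.
        n_split = min(len(centroids), target_size - len(centroids))
        grown = []
        for i, c in enumerate(centroids):
            if i < n_split:
                grown.append([v + epsilon for v in c])
                grown.append([v - epsilon for v in c])
            else:
                grown.append(c)
        centroids = refine(data, grown)
    return centroids


if __name__ == "__main__":
    features = [[0, 0], [0, 1], [1, 0], [1, 1],
                [10, 10], [10, 11], [11, 10], [11, 11]]
    print(grow_codebook(features, 2))
```

Splitting only as many centroids as needed lets the codebook reach sizes that are not powers of two; a real system would also need a principled choice of which centroids to split.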
Multi-dialectal Spanish speech recognition
Spanish is a global language, spoken in a large number of countries with great dialectal variability. This paper deals with the suitability of using a single multi-dialectal acoustic model for all the Spanish variants spoken in Europe and Latin America. The objective is twofold. First, it allows all the available databases to be used to jointly train and improve the same system. Second, it allows a single system to be used for all Spanish speakers. The paper describes the rule-based phonetic transcription used for each dialectal variant, the selection of the shared and the specific phonemes to be modeled in a multi-dialectal recognition system, and the results of a multi-dialectal system dealing with dialects in and out of the training set.
Peer Reviewed